
Deep Reinforcement Learning for Robotic Manipulation with Asynchronous Off-Policy Updates



Abstract

Reinforcement learning holds the promise of enabling autonomous robots to learn large repertoires of behavioral skills with minimal human intervention. However, robotic applications of reinforcement learning often compromise the autonomy of the learning process in favor of achieving training times that are practical for real physical systems. This typically involves introducing hand-engineered policy representations and human-supplied demonstrations. Deep reinforcement learning alleviates this limitation by training general-purpose neural network policies, but applications of direct deep reinforcement learning algorithms have so far been restricted to simulated settings and relatively simple tasks, due to their apparent high sample complexity. In this paper, we demonstrate that a recent deep reinforcement learning algorithm based on off-policy training of deep Q-functions can scale to complex 3D manipulation tasks and can learn deep neural network policies efficiently enough to train on real physical robots. We demonstrate that the training times can be further reduced by parallelizing the algorithm across multiple robots which pool their policy updates asynchronously. Our experimental evaluation shows that our method can learn a variety of 3D manipulation skills in simulation and a complex door opening skill on real robots without any prior demonstrations or manually designed representations.
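The asynchronous scheme the abstract describes — several robots pooling their experience while off-policy Q-updates are applied from the shared pool — can be sketched in miniature. The sketch below is an illustrative assumption, not the paper's actual setup: it uses a tabular Q-function on a toy 1-D chain task, thread-based "robot" workers, and made-up hyperparameters, whereas the paper trains deep neural-network Q-functions.

```python
import random
import threading
from collections import deque

# Toy sketch of asynchronous off-policy learning: worker threads (the
# "robots") push transitions into one shared replay buffer while a single
# learner thread performs Q-learning updates from that pooled experience.
# Environment, hyperparameters, and all names here are illustrative.

N_STATES, GOAL = 5, 4            # chain states 0..4, reward at state 4
ACTIONS = (-1, +1)               # move left / right
buffer, buf_lock = deque(maxlen=10_000), threading.Lock()
Q = [[0.0, 0.0] for _ in range(N_STATES)]

def worker(steps, seed):
    """Collect transitions with a random (hence off-policy) behavior policy."""
    rng, s = random.Random(seed), 0
    for _ in range(steps):
        a = rng.randrange(2)
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0
        with buf_lock:
            buffer.append((s, a, r, s2))
        s = 0 if s2 == GOAL else s2   # reset episode at the goal

def learner(updates, alpha=0.5, gamma=0.9):
    """Apply Q-learning updates to randomly sampled pooled transitions."""
    rng, done = random.Random(0), 0
    while done < updates:
        with buf_lock:
            if not buffer:
                continue              # wait until workers produce data
            s, a, r, s2 = buffer[rng.randrange(len(buffer))]
        target = r + gamma * max(Q[s2])
        Q[s][a] += alpha * (target - Q[s][a])
        done += 1

workers = [threading.Thread(target=worker, args=(2000, i)) for i in range(3)]
learn = threading.Thread(target=learner, args=(100_000,))
for t in workers:
    t.start()
learn.start()
for t in workers:
    t.join()
learn.join()

# Greedy action per non-goal state (1 = move right, toward the goal).
print([max(range(2), key=lambda a: Q[s][a]) for s in range(N_STATES - 1)])
```

Because Q-learning bootstraps from `max(Q[s2])` rather than from the behavior policy's own actions, the learner can consume experience generated by any number of workers under any exploration policy — this is what lets the pooled, asynchronous updates reduce wall-clock training time.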

